Title

Integrated Circuit Containing Multiple Digital Signal Processors 

Field of the Invention 

The invention relates to digital signal processing, and more particularly to 
integrated circuits containing multiple digital signal processing cores. 

Background of the Invention 
[0001] Digital signal processors (DSPs) are computing devices that process data that 
has been converted from analog form to digital form. Among the functions typically 
performed by DSPs are compression and decompression of data and echo cancellation. 
In traditional applications, one DSP has typically been placed on one integrated circuit 
chip. Several advantages can be gained by placing multiple DSPs on a single chip rather 
than having only one DSP on a chip. First, the amount of space on a circuit board taken 
up by the DSPs is reduced. Under the traditional approach, if four DSPs were needed in 
a circuit, four separate chips would have to be placed on the circuit board. When four 
DSPs are placed on a single chip, only one chip is needed instead of four and the amount 
of space on the circuit board used by the DSPs is reduced accordingly. Second, electrical
energy tends to be wasted by the random access memory, input/output, and other peripherals
on each chip, and particularly by the input/output ports. The use of a multi-DSP chip reduces
this waste by reducing the number of chips on the board. Connections between the 
multiple DSPs on one chip do not need input/output circuits, but instead operate at the 
low internal power levels. Thus, the amount of power consumed by the circuit board and 
the amount of heat generated by the board are reduced. The reductions in space and 
energy consumption contribute to a cost savings for multi-DSP as opposed to single-DSP
chips. The use of multiple DSPs on a single chip instead of on separate chips also
increases processing speed by reducing the distance between the DSPs and decreasing 
the number of interconnections among them. 

[0002] Prior to the development of the present invention, at least one chip was known 
to exist that improved on the traditional configuration by placing multiple DSPs on a single 
chip. The Texas Instruments TMS320VC5441 Fixed-Point Digital Signal Processor 
contains four DSPs in a single integrated circuit. The TMS320VC5441 is described in a 
Texas Instruments data manual, Literature Number SPRS122C, which is incorporated 
herein by reference. 

[0003] While the Texas Instruments TMS320VC5441 offers the advantages described
above, that chip also has several drawbacks. Communication between each DSP and a
host processor outside the chip is achieved through a multiplexing unit connected to a
host processor interface on each DSP subsystem. Because of the multiplexing function,
only one DSP can be accessed at a time, slowing down overall communication speed
within the chip. The presence of a host processor interface on each DSP subsystem
adds to the complexity of the chip and increases the number of interconnections needed
among the components on the chip. Also, the host processor interface on each DSP 
subsystem shares a data bus with a memory control unit. Because of this configuration, 
memory access speed is reduced when the host processor interface is active. The 
present invention overcomes these drawbacks while retaining the advantages previously 
described. 

Summary of the Invention 
[0004] The present invention is a system-on-a-chip (SoC) integrated circuit containing
multiple digital signal processors (DSPs). In an embodiment of the invention, hereafter 
referred to as the DSP/SoC, the integrated circuit includes two or more DSPs and a single 
host processor interface. Each DSP includes its own memory unit and a direct memory 
access (DMA) device. 

[0005] Each memory unit may include an instruction memory module and controller, a 
data memory module and controller, and two or more time division multiplexing devices 
serving as serial port interfaces to couple data to and from each data memory module 
through its DMA. 

[0006] In one embodiment, the DSPs used in the integrated circuit may be LSI Logic 
ZSP400 digital signal processors. 

[0007] In the various embodiments of the DSP/SoC, a test port complying with the 
Joint Test Action Group standard can be connected to all of the DSPs to perform testing 
and debugging functions. 

[0008] By placing more than one DSP on a semiconductor chip, the DSP/SoC system 
reduces the number of chips needed on a circuit board to perform digital signal 
processing functions. This reduction in the number of chips in turn leads to a decrease in 
power consumption and heat generation and a savings in costs. Processing speed is 
increased since the distance between DSPs and the number of interconnections among 
DSPs is decreased. In addition, the DSP/SoC chip uses only one host processor
interface for the entire chip as opposed to one host processor interface per DSP as used 
by existing multi-DSP chips. This leads to a further increase in processing speed and a 
decrease in circuit complexity. Speed of memory access is increased in the DSP/SoC 
system over existing technology since the host processor interface and the memory
control units do not share a common bus. 

Description of the Drawings 
[0009] The invention, together with further advantages thereof, may best be 
understood by reference to the following drawings, in which:

[0010] Figure 1 is a block diagram depicting a typical configuration of a DSP/SoC 
multiple digital signal processor integrated circuit. 

[0011] Figure 2 is a more detailed block diagram showing signal paths between the 
various elements of an integrated circuit according to the present invention. 

Detailed Description of the Invention 
[0012] The present invention is a system-on-a-chip (SoC) integrated circuit 10 
containing multiple digital signal processors (DSPs). A preferred embodiment of the 
invention, hereafter referred to as the DSP/SoC, is shown in Figure 1. An external host 
processor 12 sends commands and data through Host Processor Concentrator 14, also 
external, to a Host Processor Interface (HPI) 16 which is part of the DSP/SoC 
semiconductor chip 10. In this embodiment, the DSP/SoC 10 is employed to process and 
direct voice traffic in a communications system, and the host concentrator 14 routes voice 
data packets, along with data and commands from the host processor to the DSP/SoC. 
The HPI 16 controls four digital signal processor subsystems 18-21. In alternative 
embodiments, a different number of subsystems could be present. A phase-locked loop 
clock unit (PLL) 22 controls the timing of all elements of the DSP/SoC 10 and in particular 
provides timing signals to clock systems within DSP subsystems 18-21. A JTAG port 24, 
located inside the DSP/SoC chip 10, and a JTAG controller 26, located outside the 
DSP/SoC chip, provide testing and debugging capabilities. The term JTAG refers to the
Joint Test Action Group IEEE 1149.1 boundary-scan standard. Eight T1/E1 framers 31, 
32, 33, 34, 35, 36, 37, 38, also located outside the DSP/SoC chip 10, provide input into 
and receive output from the DSP/SoC. In alternative embodiments a different number of 
framers 31-38 could be present or other interface devices such as H.100/H.110 devices 
may be used instead. 

[0013] In this embodiment, each digital signal processor subsystem 18-21 includes an 
LSI Logic ZSP400 open architecture digital signal processor core 41-44, an instruction 
memory area (IMEM) 46-49, a data memory area (DMEM) 50-53, a direct memory
access (DMA) device 54-57, and two time division multiplexing (TDM) serial ports 61-68, 
respectively. The IMEMs 46-49 and DMEMs 50-53 each include an internal memory 
controller unit that also connects with the DSP cores 41-44, the HPI 16, and other 
peripherals. A common memory bus 70 provides the HPI 16 with access to the IMEMs 
46-49 and DMEMs 50-53. 

[0014] In the embodiment depicted in Figure 1, the IMEMs 46-49 have an address 
space of 64K with each addressed site storing 16 bits. The memories are organized so 
that 64 bits can be read per access. This allows four read and/or write instructions to be 
transmitted at one time. The DMEMs 50-53 depicted have an address space of 64K and 
a storage size of 16 bits per address. In alternative embodiments, memory modules 
having other sizes for the address spaces and storage spaces could be used. In further 
alternative embodiments, digital signal processors other than the ZSP400 could be used 
and a different number of TDM serial ports could be present. For purposes of this 
specification, the term ZSP400 refers to any LSI Logic digital signal processor. 
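
By way of illustration only, the memory figures given above can be checked with a short calculation. The sketch below assumes that the full 64K address space is populated; the constant and variable names are hypothetical and do not appear in the drawings.

```python
# Worked arithmetic for the memory geometry described above.
# Figures taken from the text: 64K addresses, 16 bits per address,
# 64 bits delivered per instruction-memory access.
# Assumption: the whole 64K address space is populated.
ADDRESSES = 64 * 1024   # addressable locations
WORD_BITS = 16          # bits stored at each address
ACCESS_BITS = 64        # bits read per IMEM access

words_per_access = ACCESS_BITS // WORD_BITS      # 16-bit words per access
bytes_per_memory = ADDRESSES * WORD_BITS // 8    # capacity in bytes
address_bits = (ADDRESSES - 1).bit_length()      # address lines required

print(words_per_access, bytes_per_memory, address_bits)
```

Under these assumptions, a 64-bit access carries four 16-bit words, which is consistent with four instructions being transmitted at one time.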

[0015] The HPI 16 used in a preferred embodiment of the DSP/SoC 10 is a 16-bit
interface that provides the off-chip host processor 12 with access to the memory modules 
46-49 and 50-53 of the DSP subsystems 18-21 and the DMA memory map. It is a
passive interface that has a handshake protocol to work with the intelligent host 
concentrator 14 to provide a fast and effective data transfer. In alternative embodiments, 
other types of host processor interfaces could be used. 

[0016] A common internal bus 70 connects the HPI 16 to the instruction memories 46-49
and data memories 50-53 in all four subsystems 18-21. By means of this bus
structure, the HPI 16 provides the host processor concentrator 14, and therefore the host
processor 12, with access to the instruction memory 46-49 and data memory 50-53 in 
each of the subsystems 18-21. Using the HPI 16, it is possible for the host processor 12
to place program instructions (e.g., an echo canceling algorithm) into the instruction
memories 46-49 of the DSP subsystems 18-21, and to place data (e.g., digital filter
coefficients) into the data memories 50-53 of the DSP subsystems 18-21. This is typically
done during the initialization and configuration of the DSP/SoC 10 by the host processor
12, immediately following the application of power to the IC 10. Initialization is generally
necessary because memories are typically "volatile" - i.e., they do not retain instructions
or data when power is removed. Consequently, if power to the DSP/SoC 10 is turned off, 
the contents of these memories must be restored when the IC is activated again. During 
the initialization process the DSPs 41-44 may be held in reset, so that they do not attempt
to execute program instructions from the instruction memories 46-49. Once the host 12 
has completed initialization, the DSPs 41-44 are released from reset to begin normal 
execution. In addition to program instructions and data required by the DSPs 41-44, the 
DMA controllers 54-57 and TDM serial ports 61-68 may rely on configuration data
contained in data memories 50-53, which must be established by the host processor 12 
during initialization. 
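
The initialization order described above (hold the DSPs in reset, load the instruction and data memories through the HPI, then release the DSPs) can be sketched as a small software model. This is an illustrative sketch only: the classes `Subsystem` and `HostInterface`, the function `initialize`, and the example words are hypothetical and do not describe the register interface of the HPI 16. The model refuses writes after release purely to make the ordering visible; the specification does not impose that restriction.

```python
from dataclasses import dataclass, field


@dataclass
class Subsystem:
    """Hypothetical model of one DSP subsystem (core plus IMEM and DMEM)."""
    in_reset: bool = True
    imem: dict = field(default_factory=dict)
    dmem: dict = field(default_factory=dict)


class HostInterface:
    """Hypothetical stand-in for the HPI; method names are illustrative."""

    def __init__(self, count=4):
        self.subsystems = [Subsystem() for _ in range(count)]

    def write(self, index, memory, address, word):
        sub = self.subsystems[index]
        # Simplification of this sketch: loading is only modeled during reset.
        if not sub.in_reset:
            raise RuntimeError("sketch loads memory only while core is in reset")
        getattr(sub, memory)[address] = word & 0xFFFF  # 16-bit storage

    def release(self, index):
        self.subsystems[index].in_reset = False


def initialize(hpi, program, coefficients):
    """Load every subsystem, then release all cores from reset."""
    for i in range(len(hpi.subsystems)):
        for address, word in enumerate(program):
            hpi.write(i, "imem", address, word)
        for address, word in enumerate(coefficients):
            hpi.write(i, "dmem", address, word)
    for i in range(len(hpi.subsystems)):
        hpi.release(i)


hpi = HostInterface()
initialize(hpi, program=[0x1234, 0xABCD], coefficients=[0x0001, 0x7FFF])
```

Because volatile memories lose their contents when power is removed, a host following this pattern would repeat the whole sequence at every power-up.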

[0017] During normal operation of the DSP/SoC chip 10, after the initial programming 
of the DSP subsystems 18-21 is complete, digital data signals are input through the 
T1/E1 framers 31-38 into the TDM serial ports 61-68. Each framer 31-38 inputs data into 
one TDM serial port 61-68. Data from multiple TDM serial ports 61-68 then feeds into each
DMA 54-57. In Figure 1, two TDM serial ports, e.g. 63, 64, are shown feeding into one
DMA, e.g. 55, but in alternative embodiments more than two TDM serial ports could feed 
into a single DMA. Each DMA 54-57 then sends the data to a DMEM 50-53. A DSP 41-44
acts on the data using the instructions stored in its respective IMEM 46-49. The DSP
41-44 then sends the processed data back to the DMEM 50-53. The HPI 16 polls the 
DSP 41-44 for the completion of the processing of a frame of data. If processing is 
complete, the DMA 54-57 retrieves the processed data from the DMEM 50-53 and sends 
it to the TDM serial ports 61-68. The TDM serial ports 61-68 then send the processed 
data back to the T1/E1 framers 31-38. 
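
The receive path described above can be illustrated by mapping reference numerals to one another. The sketch below assumes that framers and TDM serial ports are paired in numerical order (framer 31 with port 61, and so on); the text states only that each framer feeds one port, so this pairing and the function name `route` are assumptions. The two-ports-per-DMA grouping follows the example of ports 63 and 64 feeding DMA 55.

```python
# Reference numerals from Figure 1.
FRAMERS = range(31, 39)     # T1/E1 framers 31-38
TDM_PORTS = range(61, 69)   # TDM serial ports 61-68
DMAS = range(54, 58)        # DMA devices 54-57
DMEMS = range(50, 54)       # data memories 50-53
PORTS_PER_DMA = 2           # as drawn in Figure 1


def route(framer):
    """Return (TDM port, DMA, DMEM) numerals for a given framer numeral."""
    index = framer - FRAMERS.start
    port = TDM_PORTS.start + index           # assumed in-order pairing
    group = index // PORTS_PER_DMA           # two ports share one DMA
    return port, DMAS.start + group, DMEMS.start + group


for framer in FRAMERS:
    print(framer, route(framer))
```

With a different number of ports per DMA, as contemplated for alternative embodiments, only `PORTS_PER_DMA` and the ranges would change.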

[0018] The DMA units 54-57 include a descriptor based, multichannel, indexed DMA 
controller which reduces the interrupt overhead during data transfers among pairs of 
devices in any of the three buses. To enhance the use of the TDM serial ports 61-68, the 
indexed DMA channels perform sequential or indexed accesses to or from the internal 
Data Memory 50-53 of the Subsystems 18-21. These channels are designed specifically
to work with the TDM serial ports 61-68. Data buffers can read from or write to DSP Data 
memory corresponding to logical TDM channels (time slots). The user specifies the 
buffer length and the number of buffers to service, and the DMA 54-57 controller
automatically updates the pointer for each transfer within a frame. When a frame transfer 
completes, the pointer updates the memory address and begins transferring data for the 
next frame. When the DMA channel pointer reaches the last location of the last buffer, an 
interrupt is generated to the requester and the DMA transaction is terminated. This 
feature effectively automates the distribution of data from different time slots of the 
incoming TDM stream to a set of designated buffers. 
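
The distribution of time slots to buffers described above can be modeled in software as follows. This is a minimal sketch under stated assumptions: one buffer per logical TDM channel, one sample per time slot per frame, and a returned flag standing in for the interrupt. The function name `distribute_frames` and its parameters are hypothetical, and the actual descriptor format of the DMA controller is not modeled.

```python
def distribute_frames(frames, num_buffers, buffer_length):
    """Scatter TDM samples into one buffer per time slot.

    frames: list of frames, each holding one sample per time slot.
    Returns (buffers, interrupt), where interrupt is True once the last
    location of the last buffer has been written.
    """
    buffers = [[None] * buffer_length for _ in range(num_buffers)]
    pointer = 0  # offset within every buffer, advanced once per frame
    for frame in frames[:buffer_length]:
        # Within a frame, step through the buffers, one per time slot.
        for slot in range(num_buffers):
            buffers[slot][pointer] = frame[slot]
        pointer += 1  # frame complete: move to the next location
    interrupt = pointer == buffer_length
    return buffers, interrupt


full_buffers, full_irq = distribute_frames(
    [[10, 20, 30], [11, 21, 31]], num_buffers=3, buffer_length=2)
part_buffers, part_irq = distribute_frames(
    [[1, 2]], num_buffers=2, buffer_length=3)
print(full_buffers, full_irq)
print(part_buffers, part_irq)
```

In this model each buffer ends up holding the consecutive samples of a single time slot, so the processor is interrupted once per filled buffer set rather than once per sample.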

[0019] The TDM serial ports 61-68 are synchronous serial ports that support 8- or 16-bit
active or passive transfers. They allow a glueless interface to T1/E1 framer devices
or H.100/H.110 interface devices. Their control registers, input data and output data 
registers are memory-mapped and the DMA units 54-57 can transfer data directly 
between the serial port input and output registers and dual-access RAM simultaneously 
with other processor operations. 

[0020] Figure 2 provides a more detailed block diagram and signal flow chart for the 
DSP/SoC 10 of the present embodiment. To simplify Figure 2, certain external elements 
of Figure 1 are omitted as follows: host processor 12, controller 26, and framers 31-38. In 
addition, details of DSP subsystems 19-21 are omitted since they are parallel to 
subsystem 18. Start up and operation of the DSP/SoC will be described with reference to 
Figure 2. 

[0021] The DSP/SoC 10 starts operating when power is turned on and the hardware 
Reset signal is de-asserted by the Host Processor. Upon start-up, a reset control in the 
HPI 16 control registers holds all DSP subsystems 18-21 in reset mode. DSP Cores 41,
etc., DMAs 54, etc., memories 46, 50, etc., and TDM serial ports 61, 62, etc. are held in
Reset and in IDLE state. During this time, the HPI 16 communicates with the host
processor 12 to perform self-test using BIST and JTAG. The HPI 16 is then used by the 
host processor 12 to store data in the DSP subsystem memory 46, 50, etc. The data are
stored via the memory controller interface. The data stored to the subsystem's instruction
memory 46, etc. are used to configure/program the DSP Cores. The data stored to data
memory 50, etc. are used to configure DMA 54, etc. and TDM serial ports 61, 62, etc. The
HPI 16 has a broadcast mode that allows some or all of the DSP subsystems 18, etc. to
receive configuration parameters and/or instruction code at the same time. When all devices
in all DSP subsystems 18, etc. are configured and the programs are stored in instruction
memory 46, etc., the reset control in the HPI 16 control register is asserted to bring the
DSP subsystems 18, etc. out of reset. An individual DSP subsystem, e.g. 18, or all
subsystems can be brought out of reset at the same time.

[0022] When a DSP subsystem 18, etc. comes out of reset, it will await a frame of 
data from its DMA 54, etc. to process (for receive direction) or data from HPI 16 to 
process (for transmit direction). The DMA 54, etc. and HPI 16 notify a DSP core when
every channel's frame of data is in a DSP subsystem's data memory 50, etc., ready for
DSP core 41, etc. to process. The notification is done via an Interrupt control signal.
Upon Interrupt notification, the DSP core performs the data processing routine that was
stored in its instruction memory, for example a voice codec algorithm. The results of the
DSP core's data processing are then stored to data memory. The host processor 12 polls
the status of DSP cores 41, etc. to recognize the completion of processing a frame and
reads it (for receive direction) or instructs DMA 54, etc. to get it and send it to TDM serial
ports 61, 62, etc. (for transmit direction).

[0023] The data memory units 50-53 are set up in circular buffer banks with
programmable circular buffer pointers that allow the HPI 16, DMAs 54-57, and DSP cores
41-44 to access them without collision. There are three 8Kx16 and four 4Kx16 banks per
subsystem so that TDM data, HPI data and Core data can be accessed without disturbing
the current frame's data. 
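
The bank figures given above can be totaled, and the wraparound behavior of a circular buffer pointer illustrated, with the short sketch below. The names `BANK_WORDS` and `advance` are hypothetical, and the modulo update is only one common way to realize a circular pointer; the specification does not state how the programmable pointers are implemented.

```python
# Bank sizes from the text: three 8Kx16 and four 4Kx16 banks per subsystem.
BANK_WORDS = [8 * 1024] * 3 + [4 * 1024] * 4
WORD_BITS = 16

total_words = sum(BANK_WORDS)                 # 16-bit words per subsystem
total_bytes = total_words * WORD_BITS // 8    # same capacity in bytes


def advance(pointer, step, bank_words):
    """Advance a circular pointer, wrapping at the end of its bank."""
    return (pointer + step) % bank_words


print(total_words, total_bytes)
print(advance(4095, 1, 4 * 1024))   # wraps to the start of a 4K bank
print(advance(100, 160, 8 * 1024))  # stays inside an 8K bank
```

Keeping each data source in its own bank, with its own wrapping pointer, is what lets one party fill a bank while another reads a different bank.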

[0024] The use of a single HPI 16 for the entire multi-DSP chip 10 rather than an HPI
for each DSP 41-44 reduces the complexity of the DSP/SoC system 10. The single HPI 
16 used in the DSP/SoC system 10 has the capability to broadcast instructions directly to 
all DSPs 41-44 simultaneously or, through the use of chip select signals, it can send 
instructions to any one, two, or three at a time. This eliminates the need for a multiplexing 
unit to act as an intermediary between a host processor 12 and the DSPs 41-44. Fewer 
interconnections among components are needed, complexity is reduced, and 
programming is simplified in the DSP/SoC system 10 as opposed to existing technology 
since only one HPI 16 is used and no multiplexor is present. Also, because the HPI 16 
does not share a bus with the memory modules, the HPI and the memory modules can 
be active simultaneously with no loss of data processing speed. 
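
The selective broadcast described above can be sketched with a chip-select mask. The encoding of the chip select signals as one bit per subsystem is an assumption made for illustration, as are the function name `hpi_write` and the example addresses and words; the specification states only that chip select signals determine which DSPs receive a transfer.

```python
def hpi_write(memories, select_mask, address, word):
    """Write one 16-bit word to every subsystem whose select bit is set.

    memories: one dict per subsystem, standing in for its memory.
    select_mask: hypothetical encoding, bit i selects subsystem i.
    """
    for index, memory in enumerate(memories):
        if select_mask & (1 << index):
            memory[address] = word & 0xFFFF


memories = [dict() for _ in range(4)]
hpi_write(memories, 0b1111, 0x0000, 0xBEEF)  # broadcast to all four
hpi_write(memories, 0b0101, 0x0001, 0x1234)  # subsystems 0 and 2 only
hpi_write(memories, 0b1000, 0x0002, 0x0042)  # subsystem 3 only
print(memories)
```

A single write with all select bits set reaches every subsystem in one transfer, which is the behavior that removes the need for a multiplexing unit between the host and the DSPs.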

[0025] The JTAG test port 24, complying with the Joint Test Action Group (JTAG) 
standard, also known as IEEE Standard 1149.1, is connected to all of the DSPs 41-44 in 
the DSP/SoC system 10 and to the HPI 16 to perform testing and debugging functions. 
The JTAG port provides access to all on-chip resources. A ZSP400 in-circuit emulator 
(ICE) can be operated via the JTAG port 24 to allow full visibility and control of all ZSP400 
cores. The JTAG port 24 also has the capability to read from and write to all memory in 
the system while the system is running by multiplexing into the DSP/SoC system 10. 

[0026] While the present invention has been illustrated and described in terms of
particular apparatus and methods of use, it is apparent that equivalent parts may be 
substituted for those shown and other changes can be made within the scope of the 
present invention as defined by the appended claims. 